Abstract

We develop a framework that we call compressive rate estimation. We assume that the composite channel
gain matrix (i.e., the matrix of all channel gains between all network nodes) is compressible, which means that it can
be approximated by a sparse or low-rank representation. We study a novel sensing and reconstruction
protocol for the estimation of achievable rates. The sensing protocol exploits the superposition
principle of the wireless channel and enables the receiving nodes to obtain non-adaptive random measurements
of columns of the composite channel matrix. The random measurements are fed back to a central controller that
decodes the composite channel gain matrix (or parts of it) and estimates individual user rates. We analyze the
rate loss for a linear and a non-linear decoder and find the scaling laws with respect to the number of non-adaptive
measurements. In particular, if we consider a system with $N$ nodes and assume that each column of the composite
channel matrix is $k$-sparse, our findings can be summarized as follows. For a certain class of non-linear decoders
we show that if the number of pilot signals $M$ scales like $M \sim k \log(N/k)$, then the rate loss compared to
perfect channel state information remains bounded. For a certain class of linear decoders we show that the rate
loss compared to perfect channel state information scales like $1/\sqrt{M}$.


I. Introduction 

Device-to-device (D2D) communication has evolved as one of the key technology enablers for 5G wireless
systems (“Beyond 2020 Networks”) [1]. The basic idea of D2D communication is to establish direct short-distance
communication links between pairs of suitably selected wireless devices so that there is no need for
long-distance transmissions to and from base stations (BS). Exploiting direct communication between nearby
devices has a huge potential for boosting the performance of cellular networks [2] and improving the service
quality of proximity-based applications [3]. In addition, D2D communication makes some exciting new
location-based services and applications possible.

The main potential advantages of D2D communications stem from the proximity, reuse, and hop gains that
can be summarized as follows [4]:

• Coverage improves since direct D2D links¹ can be used to fill coverage holes;

• Capacity increases due to the reuse of radio resources of the supporting cellular layer by multiple D2D
links [5];

• Energy efficiency increases since transmit powers can be reduced without deteriorating the capacity [6];

• Achievable peak rates increase and end-to-end latencies decrease due to proximity and hop gains.

D2D communication has been extensively studied in the context of ad-hoc networks, in which wireless devices
utilize unlicensed spectrum resources with no or strictly limited assistance from a fixed network infrastructure.
Such solutions are not suitable for general purpose wireless applications due to the lack of quality-of-service
(QoS) guarantees to D2D links [7]. This is also true in the case of other approaches to D2D communication that
are based on the concept of cognitive radio and dynamic/opportunistic spectrum access [8]. Therefore, these
approaches have found limited acceptance in the standardization bodies.

In order to overcome the limits of unassisted ad-hoc networking technologies and opportunistic spectrum
access technologies based on spectrum sensing, researchers have recently turned their attention towards
network-assisted D2D communication, which promises more efficient spectrum utilization, QoS support and higher
reliability, while providing D2D discovery support, synchronization and security [2], [4], [5]. In particular, the
design aspects of D2D communication are currently discussed in 3GPP, where the feasibility and the architecture
enhancements of so-called proximity services (ProSe) are being studied [9], [10]. D2D links can
operate in in-band mode and out-band mode. While the in-band D2D mode utilizes the same spectral resources
as cellular users that transmit their data via base stations in the traditional cellular mode, the out-band D2D mode
allocates cellular users and D2D links to different frequency bands. We focus on in-band D2D communication
and assume that all users are in-coverage, which means that each user is connected to some base station.² As
an underlay to cellular networks, in-band D2D communication can be seen as a network-assisted interference
channel, in which D2D transmissions reuse cellular resources while being assisted by base stations.

Despite key advantages, network-assisted D2D communication also poses some fundamental challenges
including transmission mode selection, robust interference management and feedback design. The underlying
problems are aggravated by the lack of channel state information (CSI) at different locations in a network.³
There is in particular a vital need for timely and accurate CSI that can be used by the network controller to
facilitate reliable D2D discovery and QoS-aware scheduling. In other words, when establishing D2D links and
allocating cellular resources to them, the network controller should have enough CSI to ensure that the QoS
demands of all cellular and D2D users (e.g., expressed in terms of some minimum data rate requirements) are
guaranteed once in-band D2D links are established. While being highly valuable, CSI does not come for free and must
be obtained as efficiently as possible without consuming too many scarce radio resources. In ED the authors used

1 We refer the reader to Section II for more details about the terminology used throughout the paper.

2 Nonetheless, we point out that most of the proposed methods and concepts can be extended to enable D2D communication for
out-of-coverage users.

3 Notice that CSI is used in a broad sense here and does not necessarily mean the full channel knowledge. In particular, CSI may also
refer to the information about achievable rates.

methods from compressed sensing to acquire channel state information at the central controller of a two-hop
network.

A. Our Contribution 

This paper contributes towards the development of measurement-based feedback protocols, with the goal of 
enabling a network controller to acquire the required CSI in a highly efficient way. Such protocols need to 
perform the following steps IITZTl : 

• Spectral resource management : The BS assigns cellular users to the available spectral resources. This step 
is performed in any cellular network with centralized resource management, e.g., 3GPP LTE. 

• D2D discovery and mode selection: The BS detects wireless devices that are in proximity to each other 
(D2D discovery) and decides if a device should operate in cellular mode or D2D mode (mode selection). 

• Pairing: The network controller decides if one or more D2D links share a spectral resource with some 
cellular user. 

The focus of this paper is on D2D discovery - also called proximity discovery - and on pairing, which is a 
part of scheduling decisions that assign resources to cellular users and D2D links. Both tasks - D2D discovery 
and pairing - are entirely carried out by a network controller where enough CSI is needed for robust decisions. 
Assuming D2D communication as an underlay to a cellular network, we address the problem of reliable 
D2D discovery and pairing based on compressed and quantized channel measurements. We develop and study 
a novel sensing and reconstruction strategy (protocol) for the estimation of achievable rates, which we call 
compressive rate estimation. The proposed protocol combines the estimation from compressed measurements 
with coded access to reduce the number of pilot-based measurements that need to be fed back to estimate the 
achievable rates and to make timely and robust QoS-aware decisions. By using the concept of coded access 
we are able to exploit collisions in an interference channel to obtain compressed non-adaptive measurements 
from linear random projections (e.g. analog coding of ma can be used for this purpose). To estimate the 
rates, we apply methods from compressed sensing and sparse approximation na. Since a major drawback of 
compressed sensing based techniques is that they require highly complex decoders, we also consider linear 
estimation methods which require significantly less complexity 02). As we will see, the advantages of the 
proposed protocol are three-fold. First, by applying the concept of coded access, we are able to significantly 
reduce the pilot contamination in the network. Second, the feedback overhead is reduced since significantly 
fewer measurements need to be quantized and fed back. Third, most of the complexity required to estimate the 
achievable rates is imposed on the network controller. 

B. Notation 

The element in the $i$-th row and $j$-th column of a matrix $X$ is given by $[X]_{ij} = x_{i,j}$; similarly, the $j$-th
element of a vector $x$ is given by $x_j$. The conjugate transpose of a matrix $X$ is $X^H$. For vectors the $\ell_p$-norm is
given by $\|x\|_p = \left(\sum_i |x_i|^p\right)^{1/p}$, $p \ge 1$. For matrices the Schatten-$p$ norm is given by
$\|X\|_{S_p} = \left(\sum_i \sigma_i^p(X)\right)^{1/p}$,
where $\{\sigma_i(X)\}_i$ are the singular values of the matrix $X$ in decreasing order. The operator $x = \mathrm{vec}(X)$ stacks
the columns of the matrix $X$ in a large column vector $x$. The support $\mathrm{supp}(x)$ of a vector $x$ is the index set
of its non-zero elements. The $N \times N$ identity matrix is denoted by $I_N$ and its $i$-th column by $e_i$.
Tuples are denoted by calligraphic letters and the $i$-th element of a tuple $\mathcal{X}$ is given by $\mathcal{X}_i$. The real numbers
are denoted by $\mathbb{R}$ and the complex numbers by $\mathbb{C}$.


II. System Model 


We consider a cellular network with a large number of wireless devices and multiple base stations that are
controlled by a (central) network controller. We assume there are $N > 1$ transmitters that wish to establish
communication links over the (wireless) channel to transfer independent data to $N$ receivers.⁴ Communication
links between the wireless devices and the base stations are referred to as cellular users, while the term D2D
user or, equivalently, D2D link is used to refer to a communication link between two wireless devices. The
users as well as the corresponding transmitters and receivers are indexed in an arbitrary but fixed order with
indices taken from the set $\mathcal{N} = \{1, 2, \ldots, N\}$.⁵ A subset $\mathcal{N}_1 \subseteq \mathcal{N}$ is used to denote cellular users so that the
remaining users with indices in $\mathcal{N} \setminus \mathcal{N}_1$ are potential D2D users. The cellular users are assumed to have been
scheduled for (cellular) transmissions in the downlink channel.

A frequency-division multiple access (FDMA) technique such as orthogonal frequency-division multiple access
(OFDMA) together with a time-division multiple access (TDMA) technique is used to divide
the available bandwidth and time into a number of mutually orthogonal time-frequency resource units referred
to as resource blocks. We assume that the bandwidth and the duration of each resource block are smaller than
the coherence bandwidth and the coherence time of the channel, respectively. This implies that the channel for
each resource block and each user can be considered to be frequency flat and constant. More precisely, the
channel from the transmitter of user $j$ (referred to as transmitter $j$) to the receiver of user $i$ (called receiver $i$)
on resource block $(t, f)$ is described by the channel coefficient $h_{i,j}(t, f) \in \mathbb{C}$, which is a realization of some
stochastic process. We assume that all resource blocks are statistically equivalent and independent. Therefore,
we can consider an arbitrary but fixed resource block and drop the time and frequency index for simplicity.

Given a resource block, user $i \in \mathcal{N}$ may experience interference from other users $j \in \mathcal{N}$, $j \neq i$. As a result,
the performance of user $i \in \mathcal{N}$ depends in general on the vector $h_i := (h_{i,1}, \ldots, h_{i,N})^T \in \mathbb{C}^N$ of channel
coefficients $h_{i,j} \in \mathbb{C}$ from all transmitters $j \in \mathcal{N}$ to receiver $i \in \mathcal{N}$. These channel vectors are grouped in the
channel matrix $H := (h_1, \ldots, h_N)$ which contains all channel coefficients.

As discussed before, not all potential D2D users in $\mathcal{N} \setminus \mathcal{N}_1$ need to be scheduled for transmissions. Therefore,
we define $\mathcal{S} \subseteq \mathcal{N}$ to be the index set of users (cellular and D2D) scheduled for transmissions. The signal
observed by receiver $i \in \mathcal{S}$ is then

$$y_i = h_{i,i} s_i + \sum_{j \in \mathcal{S} \setminus \{i\}} h_{i,j} s_j + n_i, \tag{1}$$

where $s_j \in \mathbb{C}$ is the complex data symbol transmitted by node $j$ and $n_i \sim \mathcal{CN}(0, \sigma_i^2)$ is additive noise at receiver
$i$. The transmitted data symbols are assumed to be i.i.d. random variables with $E[s_j] = 0$ and $E[|s_j|^2] = p_j$,


4 For simplicity, the reader may assume unidirectional communication links throughout the paper but we point out that the results can 
be straightforwardly extended to bidirectional links. 

5 We also use $\mathcal{N}$ to refer to transmitters, receivers and transmissions (i.e., users scheduled for transmissions). According to this,
transmission $i \in \mathcal{N}$ is the transmission from transmitter $i \in \mathcal{N}$ to receiver $i \in \mathcal{N}$.


April 29, 2015 


DRAFT 






where the transmit power $p_j$ of user $j$ is assumed to be fixed (i.e., we consider no power control). If user $i$ is
scheduled for transmission, then its achievable rate is assumed to be⁶

$$r(h_i, \mathcal{S}) = \log\left(1 + \mathrm{SINR}(h_i, \mathcal{S})\right), \tag{2}$$

where the SINR of receiver $i \in \mathcal{S}$ is defined as the ratio of the desired signal power to the sum of the interference
and noise power:

$$\mathrm{SINR}(h_i, \mathcal{S}) := \frac{p_i |h_{i,i}|^2}{\sigma_i^2 + \sum_{j \in \mathcal{S} \setminus \{i\}} p_j |h_{i,j}|^2}. \tag{3}$$
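As a minimal illustration of (2) and (3), the following Python sketch evaluates the SINR and the achievable rate of one receiver. The function names, the zero-based user indices and the use of the natural logarithm (rates in nats) are assumptions made for this example and are not part of the system model.

```python
import math

def sinr(h_i, i, scheduled, powers, noise_var):
    """SINR of receiver i as in (3); h_i[j] is the channel from transmitter j to receiver i."""
    signal = powers[i] * abs(h_i[i]) ** 2
    interference = sum(powers[j] * abs(h_i[j]) ** 2 for j in scheduled if j != i)
    return signal / (noise_var + interference)

def rate(h_i, i, scheduled, powers, noise_var):
    """Achievable rate as in (2); natural logarithm assumed, so the unit is nats."""
    return math.log(1.0 + sinr(h_i, i, scheduled, powers, noise_var))

# Hypothetical two-user example: receiver 0 with and without an interferer.
h_0 = [1 + 0j, 0.5j]
powers = [1.0, 1.0]
rate_alone = rate(h_0, 0, {0}, powers, noise_var=1.0)
rate_shared = rate(h_0, 0, {0, 1}, powers, noise_var=1.0)
```

Scheduling the second user on the same resource block lowers the rate of receiver 0, which is the effect that the feasibility condition introduced next has to control.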


In what follows, we assume that each receiver $i$ has a rate (or quality-of-service) requirement $\bar{r}_i$ and we define
a feasible scheduling decision as follows.


Definition 1 (Feasible scheduling decision). Given a channel matrix $H$, we say that a scheduling decision $\mathcal{S}$
is feasible if $\mathcal{N}_1 \subseteq \mathcal{S}$ and $r(h_i, \mathcal{S}) \ge \bar{r}_i$ holds for each $i \in \mathcal{S} \subseteq \mathcal{N}$.

We emphasize that by the definition, $r(h_i, \mathcal{S}) \ge \bar{r}_i$ for each $i \in \mathcal{N}_1 \subseteq \mathcal{S}$ whenever $\mathcal{S}$ is feasible. In other
words, the requirements of cellular users are satisfied by definition and $\mathcal{N}_1$ is a feasible scheduling decision.
As far as the potential D2D users in $\mathcal{N} \setminus \mathcal{N}_1$ are concerned, the network controller may schedule them to be
paired with the transmissions in $\mathcal{N}_1$, provided that (i) D2D devices are in proximity to each other (see below)
and (ii) the resulting scheduling decision is feasible in the sense of Def. 1.


A. D2D discovery and pairing with perfect CSI 

As mentioned in the introduction, two main steps towards establishing a D2D communication are D2D 
discovery - also called proximity discovery - and pairing. First we need to define the notion of proximity. 

Definition 2 (Proximity). Given a channel realization, we say that two wireless devices are in proximity to 
each other if the interference-free channel between them is good enough to fulfill a given rate requirement. 

In other words, proximity is necessary (but not sufficient) for establishing a D2D link between two devices,
and D2D discovery is the process of identifying D2D candidates out of all potential D2D users. Ideally, D2D
discovery (and also pairing) should be based on the achievable rates. Indeed, if the network controller had
perfect knowledge of the channel matrix $H$, it could compute the achievable rates $r(h_i, \mathcal{S})$, $i \in \mathcal{S}$, for all
feasible scheduling decisions $\mathcal{S} \subseteq \mathcal{N}$. Thus, D2D discovery can be performed as follows.

Definition 3 (D2D discovery with perfect CSI). Assuming that the network controller has perfect knowledge
of $h_i$ for some $i \in \mathcal{N} \setminus \mathcal{N}_1$, transmitter $i$ and receiver $i$ are said to be in proximity (to each other) if $i \in \mathcal{N}_2$,
where

$$\mathcal{N}_2 = \{ i \in \mathcal{N} \setminus \mathcal{N}_1 : r(h_i, \{i\}) \ge \bar{r}_i \} \subseteq \mathcal{N}. \tag{4}$$

Therefore, $\mathcal{N}_2$ is the set of all D2D candidates.
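The discovery rule in (4) amounts to a per-user threshold test on the interference-free rate. The sketch below assumes zero-based indices, a common noise variance for all receivers, natural logarithms, and a nested list `H` whose entry `H[i][j]` holds the coefficient from transmitter j to receiver i; all names are hypothetical.

```python
import math

def interference_free_rate(h_ii, p_i, noise_var):
    """Rate r(h_i, {i}) of user i when no other user is scheduled."""
    return math.log(1.0 + p_i * abs(h_ii) ** 2 / noise_var)

def discover_candidates(H, cellular, powers, noise_var, rate_req):
    """Return the set N_2 of (4): non-cellular users whose interference-free rate meets the requirement."""
    return {
        i for i in range(len(H))
        if i not in cellular
        and interference_free_rate(H[i][i], powers[i], noise_var) >= rate_req[i]
    }

# Hypothetical example: user 0 is cellular, users 1 and 2 are potential D2D users.
H = [[1.0, 0.1, 0.1],
     [0.1, 2.0, 0.1],
     [0.1, 0.1, 0.1]]
candidates = discover_candidates(H, {0}, [1.0, 1.0, 1.0], 1.0, [1.0, 1.0, 1.0])
```

In this example only user 1 has a direct channel strong enough to meet its requirement, so it is the only D2D candidate.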

6 Note that we could assume any strictly increasing function $f : \mathbb{R}_+ \to \mathbb{R}_+$ with $f(0) = 0$ and $\lim_{x \to \infty} f(x) = +\infty$.

After performing D2D discovery, the network controller decides if D2D candidates in $\mathcal{N}_2$ are paired for
transmissions with the cellular users specified by $\mathcal{N}_1$ to establish D2D links. The optimal scheduling decision
is found as follows.


Definition 4 (Optimal pairing decision with perfect CSI). Under the assumption of perfect CSI at the network
controller, an optimal scheduling decision $\mathcal{S} \subseteq \mathcal{N}_1 \cup \mathcal{N}_2$ (that involves a pairing decision) is a solution to

$$\max_{\mathcal{X} \subseteq \mathcal{N}_2} \sum_{i \in \mathcal{X} \cup \mathcal{N}_1} r(h_i, \mathcal{X} \cup \mathcal{N}_1) \quad \text{subject to} \quad r(h_i, \mathcal{X} \cup \mathcal{N}_1) \ge \bar{r}_i \ \text{for all } i \in \mathcal{X} \cup \mathcal{N}_1. \tag{5}$$

Since $\mathcal{N}_1$ is assumed to be a feasible scheduling decision, the problem in (5) always has a solution in the sense
that if no D2D candidate can be paired with the cellular users, then $\mathcal{S} = \mathcal{N}_1$ is the feasible scheduling decision.
Note that since $\mathcal{N}_1$ is given, solving the pairing decision problem provides a feasible scheduling decision $\mathcal{S}$.
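For small candidate sets, problem (5) can be solved by exhaustive search. The following sketch is only meant to make the objective and the constraints concrete; it assumes zero-based indices, a common noise variance and natural logarithms, and its cost grows exponentially with the number of candidates, so it is not a proposal for a practical scheduler.

```python
import math
from itertools import combinations

def rate(H, i, S, p, noise_var):
    """Rate (2) of user i under scheduling decision S; H[i][j] is the channel from j to i."""
    interference = sum(p[j] * abs(H[i][j]) ** 2 for j in S if j != i)
    return math.log(1.0 + p[i] * abs(H[i][i]) ** 2 / (noise_var + interference))

def optimal_pairing(H, cellular, candidates, p, noise_var, rate_req):
    """Exhaustive search over subsets X of the candidates, maximizing the sum rate in (5)."""
    best_set, best_sum = None, -math.inf
    for size in range(len(candidates) + 1):
        for X in combinations(sorted(candidates), size):
            S = set(cellular) | set(X)
            rates = {i: rate(H, i, S, p, noise_var) for i in S}
            # Keep only feasible decisions: every scheduled user meets its requirement.
            if all(rates[i] >= rate_req[i] for i in S):
                total = sum(rates.values())
                if total > best_sum:
                    best_set, best_sum = S, total
    return best_set, best_sum

# Hypothetical example: candidate 2 interferes strongly with cellular receiver 0.
H = [[2.0, 0.1, 1.5],
     [0.1, 2.0, 0.1],
     [0.1, 0.1, 2.0]]
decision, sum_rate = optimal_pairing(H, {0}, {1, 2}, [1.0] * 3, 1.0, [1.0] * 3)
```

Here candidate 1 can be paired with the cellular user, whereas any decision containing candidate 2 violates the rate requirement of receiver 0 and is discarded.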


III. Rate Estimation Based on Compressed Measurements 

One of the central tasks of the network controller is to perform reliable D2D discovery and pairing decisions.
Here reliability is to be understood in terms of the rate requirements of all users which need to be satisfied
permanently. In other words, the resulting scheduling decisions $\mathcal{S}$ must be feasible in accordance with Def. 1
in spite of the lack of perfect CSI. By Def. 3 and Def. 4 it is clear that reliable D2D discovery and reliable
pairing decisions require accurate estimates of the achievable rates $r(h_i, \mathcal{S})$ for any feasible scheduling decision
$\mathcal{S}$. Therefore, accurate CSI is a crucial ingredient in the design of reliable communication systems.

In this section, we introduce a channel measurement and feedback protocol together with different decoders
that enable the central controller to reliably estimate the achievable rates at relatively low overhead costs. The
measurement and rate estimation protocol is summarized in Table I.

TABLE I

Measurement and rate estimation protocol.

Network controller    Transmit synchronization signal.
Transmitters          Transmit sequences of M pilot signals.
Receivers             Measure superpositions of pilot signals.
                      Quantize measurements and feed them back to the network controller.
Network controller    Estimate rates based on quantized compressed linear measurements.
                      Perform D2D discovery and make pairing/scheduling decision.


A. Random Channel Measurement 

To reduce the signaling and coordination overhead for channel measurements, all transmitters simultaneously
transmit $M \ge 1$ pilot signals. In what follows, we use $\phi_i \in \mathbb{C}^M$ to denote the pilot signals sent by transmitter
$i$, which is the $i$th column of the so-called measurement matrix denoted by $\Phi = (\phi_1, \ldots, \phi_N) \in \mathbb{C}^{M \times N}$. Then,
according to (1), the vector of all $M$ signals observed by receiver $i$ can be written as

$$y_i = \Phi h_i + n_i \in \mathbb{C}^M, \quad i \in \mathcal{N}. \tag{6}$$

Each receiver, say receiver $i \in \mathcal{N}$, quantizes the vector of channel measurements $y_i$ using a quantization
operator $Q : \mathbb{C}^M \to \mathbb{C}^M$ and feeds back the quantized values to the network controller. For simplicity, we make
the following assumption.

Assumption 1. We assume that $Q(y_i) = y_i + \tilde{n}_i$, where $\tilde{n}_i$ is additive noise independent of $y_i$. Furthermore,
we assume an error-free and delay-free feedback channel from all nodes to the network controller.

By the assumption, the CSI at the network controller is

$$z_i = Q(y_i) = y_i + \tilde{n}_i = \Phi h_i + \mu_i, \tag{7}$$

where $\mu_i := n_i + \tilde{n}_i$ is an additive noise term that contains the measurement and quantization noise. Further,
we denote the matrix of all quantized channel measurements, which is known to the network controller, by

$$Z := (z_1, \ldots, z_N) \in \mathbb{C}^{M \times N}.$$
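The measurement model (6) and (7) can be simulated in a few lines. The sketch below draws i.i.d. complex Gaussian pilots with variance 1/M, one of the ensembles considered later in the paper, and lumps measurement and quantization noise into a single circularly symmetric Gaussian term; the latter is a simplifying assumption of this example, as are all function names.

```python
import random

def complex_gaussian(var, rng):
    """Draw one sample from CN(0, var): independent real and imaginary parts of variance var/2."""
    s = (var / 2.0) ** 0.5
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def pilot_matrix(M, N, rng):
    """Measurement matrix Phi with i.i.d. CN(0, 1/M) entries, stored as M rows of length N."""
    return [[complex_gaussian(1.0 / M, rng) for _ in range(N)] for _ in range(M)]

def measure(Phi, h_i, noise_var, rng):
    """Compressed CSI z_i = Phi h_i + mu_i as in (7), with mu_i modeled as CN(0, noise_var) per entry."""
    return [
        sum(row[j] * h_i[j] for j in range(len(h_i))) + complex_gaussian(noise_var, rng)
        for row in Phi
    ]

rng = random.Random(0)
Phi = pilot_matrix(4, 6, rng)            # M = 4 pilots, N = 6 transmitters
h = [0, 0, 1 + 1j, 0, 0, 0]              # hypothetical 1-sparse channel vector
z = measure(Phi, h, noise_var=0.0, rng=rng)
```

Because all transmitters send their pilots simultaneously, each receiver obtains only M superimposed observations instead of N separate ones, which is where the reduction in feedback comes from.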

B. Channel gain estimators 

Given random channel measurements as described in the previous subsection, the goal is to estimate CSI in
the sense of minimizing the gap between the achievable rates based on perfect CSI and their estimates. To be
precise, let $z_i$ be the compressed and quantized CSI from receiver $i$ given by (7), and let $\beta(z_i, j)$ be a deterministic
function that estimates the channel gain $|h_{i,j}|^2$. Hence,

$$|\hat{h}_{i,j}|^2 := \beta(z_i, j), \quad i, j \in \mathcal{N}, \tag{8}$$

where $\hat{h}_i := (\hat{h}_{i,1}, \ldots, \hat{h}_{i,N})^T \in \mathbb{C}^N$ is an estimate of $h_i$ in the sense of (8). By (2), the achievable rates
depend on the channel only through the SINR, which in turn is a function of the channel gains $|h_{i,j}|^2$. As a result, it is sufficient
to estimate the channel gains instead of the complex channel coefficients.

In this paper, we consider different channel gain estimators specified by the functions $\beta(z_i, j)$. One class of
functions is given by channel gain estimation functions which are linear in the complex coefficients:

Definition 5 (Linear channel gain estimator). Given the CSI $z_i$ defined by (7), a linear channel gain estimation
function (for the channel coefficient $h_{i,j}$) is given by

$$\beta_l(z_i, j) = |\langle \Psi z_i, e_j \rangle|^2, \tag{9}$$

where the matrix $\Psi \in \mathbb{C}^{N \times M}$ depends on the measurement matrix $\Phi$ and $e_j$ is the $j$th column of the identity
matrix $I_N$.
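To make (9) concrete, the sketch below evaluates the linear gain estimates for all $j$ at once. The text only states that $\Psi$ depends on $\Phi$; the choice $\Psi = \Phi^H$ used here is an assumption made for illustration, motivated by the normalization $E[\Phi^H \Phi] = I_N$ of the Gaussian ensemble, and is not claimed to be the estimator analyzed in the paper.

```python
def conjugate_transpose(Phi):
    """Return Phi^H for a matrix stored as a list of rows."""
    return [[Phi[m][j].conjugate() for m in range(len(Phi))] for j in range(len(Phi[0]))]

def linear_gain_estimates(Psi, z_i):
    """Evaluate beta_l(z_i, j) = |<Psi z_i, e_j>|^2 of (9) for every j."""
    return [
        abs(sum(Psi[j][m] * z_i[m] for m in range(len(z_i)))) ** 2
        for j in range(len(Psi))
    ]

# Sanity check with a hypothetical identity measurement matrix (M = N = 3, no compression):
# the estimator then returns the squared magnitudes of the measured coefficients.
Phi = [[1 + 0j, 0j, 0j], [0j, 1 + 0j, 0j], [0j, 0j, 1 + 0j]]
z = [1 + 1j, 0j, 2j]
gains = linear_gain_estimates(conjugate_transpose(Phi), z)
```

The estimator costs one matrix-vector product per receiver, which is the low-complexity property that motivates the linear class.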

Another class of estimation functions is referred to as non-linear channel gain estimation functions.

Definition 6 (Non-linear channel gain estimator). Given the CSI $z_i$ defined by (7), a non-linear channel gain
estimation function is given by

$$\beta_{nl}(z_i, j) = |\langle a(z_i), e_j \rangle|^2,$$

where $a : \mathbb{C}^M \to \mathbb{C}^N$ is some predefined non-linear function.


C. Problem Statement: D2D discovery and pairing with imperfect CSI 

The estimated achievable rate $\hat{r}$ of user $i \in \mathcal{N}$ can be seen as a function of $z_i$. Therefore, given $h_i$ and $z_i$,
the rate gap of user $i$ depends on a scheduling decision $\mathcal{S}$ and is defined to be

$$\Delta_i(\mathcal{S}) := |\hat{r}(z_i, \mathcal{S}) - r(h_i, \mathcal{S})|, \quad i \in \mathcal{S}, \tag{10}$$

where the achievable rate $r$ is given by (2) and

$$\hat{r}(z_i, \mathcal{S}) = \log\left(1 + \frac{\beta(z_i, i)\, p_i}{\sigma_i^2 + \sum_{j \in \mathcal{S} \setminus \{i\}} \beta(z_i, j)\, p_j}\right). \tag{11}$$

Here and hereafter, $\beta(z_i, j)$ is defined by (8) and $\hat{r}(z_i, \mathcal{S})$ is the estimated rate for given $z_i$ and a scheduling decision
$\mathcal{S}$. For ease of notation, in what follows, we write $\Delta_i := \Delta_i(\mathcal{S})$ if $\mathcal{S}$ is clear from the context. We use
$\Delta_i(\{i\})$ as a basis for D2D discovery because it is the rate gap of user $i \in \mathcal{N} \setminus \mathcal{N}_1$ in an interference-free
scenario. The rationale behind the definition of the rate gap in (10) comes from the rate requirements. In particular,
if we have $\Delta_i(\mathcal{S}) \le \epsilon$ for some known $\epsilon > 0$ and an arbitrary feasible $\mathcal{S}$, then the network controller is able
to reliably perform D2D discovery and pairing.

To see this, let us first consider the problem of D2D discovery based on compressed and quantized CSI
$z_i \in \mathbb{C}^M$. We assume that the network controller can upper bound the rate gap such that $\Delta_i(\{i\}) \le \epsilon$, $i \in \mathcal{N} \setminus \mathcal{N}_1$,
for some $\epsilon > 0$. It may be easily verified that, under this assumption, the condition $\hat{r}(z_i, \{i\}) \ge \bar{r}_i + \epsilon$ implies
proximity, since then $r(h_i, \{i\}) \ge \hat{r}(z_i, \{i\}) - \Delta_i(\{i\}) \ge \bar{r}_i$. As a result,

$$\hat{\mathcal{N}}_2 = \{ i \in \mathcal{N} \setminus \mathcal{N}_1 : \hat{r}(z_i, \{i\}) \ge \bar{r}_i + \epsilon \} \subseteq \mathcal{N}_2 \tag{12}$$

is a set of device pairs that are in proximity to each other (see Def. 3) and therefore are D2D candidates. So
the network controller is able to reliably identify a subset of D2D candidates, provided that it can upper bound
the rate gap $\Delta_i(\{i\})$. Notice that the cardinality $|\hat{\mathcal{N}}_2|$ of $\hat{\mathcal{N}}_2$ is non-increasing in $\epsilon$ and $|\hat{\mathcal{N}}_2| \to 0$ as $\epsilon \to \infty$.
This means that $\epsilon$ should be as small as possible to discover and identify as many potential D2D users defined
by (4) as D2D candidates. In other words, we need a tight bound on each rate gap $\Delta_i(\{i\})$, $i \in \mathcal{N}$. Clearly, if
$\epsilon = 0$, we have $\hat{\mathcal{N}}_2 = \mathcal{N}_2$, meaning that all potential D2D users have been discovered as D2D candidates.
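The conservative discovery rule (12) differs from (4) only in that estimated gains replace true gains and the threshold is raised by the margin $\epsilon$. A minimal sketch, assuming zero-based indices, a common noise variance, natural logarithms and a hypothetical list `beta_diag` with `beta_diag[i]` $= \beta(z_i, i)$:

```python
import math

def estimated_interference_free_rate(beta_ii, p_i, noise_var):
    """Estimated rate (11) for the decision S = {i}, i.e. without interference."""
    return math.log(1.0 + beta_ii * p_i / noise_var)

def robust_candidates(beta_diag, cellular, powers, noise_var, rate_req, eps):
    """Estimated candidate set of (12): the estimated rate must exceed the requirement by eps."""
    return {
        i for i in range(len(beta_diag))
        if i not in cellular
        and estimated_interference_free_rate(beta_diag[i], powers[i], noise_var) >= rate_req[i] + eps
    }

# Hypothetical estimated direct gains for three users; user 0 is cellular.
beta_diag = [4.0, 4.0, 1.0]
args = (beta_diag, {0}, [1.0, 1.0, 1.0], 1.0, [1.0, 1.0, 0.5])
no_margin = robust_candidates(*args, eps=0.0)
small_margin = robust_candidates(*args, eps=0.3)
large_margin = robust_candidates(*args, eps=1.0)
```

The three calls illustrate the monotonicity noted above: the candidate set can only shrink as the margin grows, so a loose bound on the rate gap directly costs D2D opportunities.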

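Having introduced the set $\hat{\mathcal{N}}_2$, we are now in a position to define optimal pairing decisions with imperfect
CSI.


Definition 7 (Optimal pairing decisions with imperfect CSI). For given $\hat{\mathcal{N}}_2$ (with some $\epsilon > 0$) and $Z =
(z_1, \ldots, z_N)$ (compressed and quantized CSI), we define an optimal scheduling decision $\hat{\mathcal{S}} = \mathcal{N}_1 \cup \hat{\mathcal{X}} \subseteq \mathcal{N}_1 \cup \hat{\mathcal{N}}_2$,
where $\hat{\mathcal{X}} \subseteq \hat{\mathcal{N}}_2$ is a solution to the following problem:

$$\hat{\mathcal{X}} := \arg\max_{\mathcal{X} \subseteq \hat{\mathcal{N}}_2} \sum_{i \in \mathcal{X} \cup \mathcal{N}_1} \hat{r}(z_i, \mathcal{X} \cup \mathcal{N}_1) \tag{13}$$

$$\text{subject to} \quad \hat{r}(z_i, \mathcal{X} \cup \mathcal{N}_1) \ge \bar{r}_i + \epsilon \ \text{ for all } i \in \mathcal{X} \cup \mathcal{N}_1, \tag{14}$$

where $\hat{r}(z_i, \mathcal{S})$ is the estimated achievable rate given by (11).


IV. Rate Gap Analysis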


For different linear and non-linear channel gain estimators we seek probabilistic bounds on the rate gap
$\Delta_i(\mathcal{S})$ of the form

$$\Pr\{ \Delta_i(\mathcal{S}) \ge d_1\, g(\xi, \epsilon) \} \le \epsilon, \tag{15}$$


where $d_1 > 0$ is a constant that depends on system parameters (e.g., transmit powers, maximum number of
scheduled users $|\mathcal{S}| \le n$) and $g(\xi, \epsilon)$ is a function of the measurement and quantization noise. For simplicity
we assume that the quantization noise is bounded, $\|\mu_i\|_2 \le \xi$.


A. Tail-Estimates for Subgaussian Random Matrices 

The idea behind random pilots in channel probing is that if the amount of (sufficiently) random signaling is
above a certain threshold, the response of the channel is with high probability uniformly close to its expectation. This
principle is used in various fields of high-dimensional geometry, such as random matrix theory and compressed
sensing. In fact, we proceed here along similar lines as in ED to prove RIP-properties based on concentration 
inequalities (see here also El for more details). 

For an in-depth treatment of this phenomenon, we refer the reader to ED. A concise introduction can be 
found in Throughout this section, we assume that the elements of the measurement matrix $ are chosen 
at random and we impose the following two conditions. 

Assumption 2. The matrix $\Phi$ is normalized such that for all $a \in \mathbb{C}^N$

$$E\left[\|\Phi a\|_2^2\right] = \|a\|_2^2.$$

Assumption 3. For every $a \in \mathbb{C}^N$, the random variable $\|\Phi a\|_2^2$ is strongly concentrated around its expected
value,

$$\Pr\left\{ \left| \|\Phi a\|_2^2 - \|a\|_2^2 \right| \ge \epsilon \|a\|_2^2 \right\} \le c_0\, e^{-\gamma(\epsilon)}, \tag{16}$$

where $c_0 > 0$ is a constant and $\gamma(\epsilon)$ is a function that depends on the distribution of $\Phi$.


Examples of measurement matrices that satisfy the concentration inequality (16) are matrices with rows that
are sub-Gaussian distributed isotropic vectors (see e.g. ED). A real-valued random variable $X$ is called
sub-Gaussian if there exists a constant $c > 0$ such that the moment generating function is bounded from above
by

$$E[\exp(Xt)] \le \exp(c^2 t^2 / 2). \tag{17}$$


Examples of sub-Gaussian random variables are normally distributed random variables and bounded random
variables. In particular, if the elements of $\Phi \in \mathbb{C}^{M \times N}$ are i.i.d. distributed according to $\phi_{i,j} \sim \mathcal{CN}(0, 1/M)$,
then $E[\Phi^H \Phi] = I_N$ and $E[\|\Phi a\|_2^2] = a^H E[\Phi^H \Phi]\, a = \|a\|_2^2$. Moreover, it can be shown (see e.g. ED)
that, for $\epsilon \in (0, 1)$,

$$\Pr\left\{ \left| \|\Phi a\|_2^2 - \|a\|_2^2 \right| \ge \epsilon \|a\|_2^2 \right\} \le 2 \exp\left(-\tilde{c}\, \epsilon^2 M\right), \tag{18}$$

where $\tilde{c} > 0$ is a constant.
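Assumption 2 and the concentration behaviour in (16) can be checked numerically for the complex Gaussian ensemble. The following Monte Carlo sketch is illustrative only: the sample sizes, the seed and the function names are arbitrary choices of this example, and the empirical tail frequency is an estimate, not the bound (18) itself.

```python
import random

def squared_norm_after_projection(a, M, rng):
    """Draw Phi with i.i.d. CN(0, 1/M) entries row by row and return ||Phi a||_2^2."""
    s = (0.5 / M) ** 0.5  # standard deviation of the real and imaginary parts
    total = 0.0
    for _ in range(M):
        y_m = sum(complex(rng.gauss(0.0, s), rng.gauss(0.0, s)) * a_j for a_j in a)
        total += abs(y_m) ** 2
    return total

def empirical_mean_and_tail(a, M, eps, trials, seed=0):
    """Estimate E||Phi a||^2 and the frequency of relative deviations of at least eps."""
    rng = random.Random(seed)
    norm_sq = sum(abs(x) ** 2 for x in a)
    samples = [squared_norm_after_projection(a, M, rng) for _ in range(trials)]
    mean = sum(samples) / trials
    tail = sum(abs(s - norm_sq) >= eps * norm_sq for s in samples) / trials
    return mean, tail

a = [0.6, 0.8j, 0.0, 0.0]  # hypothetical test vector with unit norm
mean_16, _ = empirical_mean_and_tail(a, M=16, eps=0.5, trials=1000)
_, tail_4 = empirical_mean_and_tail(a, M=4, eps=0.5, trials=1000)
_, tail_64 = empirical_mean_and_tail(a, M=64, eps=0.5, trials=1000)
```

The empirical mean stays close to $\|a\|_2^2 = 1$, and the deviation frequency drops sharply when M is increased from 4 to 64, in line with the exponential decay in M.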


The sub-Gaussian assumption does not permit sufficiently structured matrices $\Phi$, but the result in [22] shows
that RIP matrices with additional column randomization provide Johnson-Lindenstrauss embeddings, and this
in turn implies a certain concentration inequality of type (16). We do not further elaborate on this here, but
refer the reader to [13] for more details.


B. Preliminary Result 

First, we introduce a general result that enables us to bound $\Delta_i$ given by (10) independently of the estimation
function. To simplify the notation we define the channel gain $x_{i,j} := |h_{i,j}|^2$, the vector of channel gains
$x_i := (x_{i,1}, \ldots, x_{i,N})^T$ and the matrix of channel gains $X := (x_1, \ldots, x_N)$. In a similar manner we define
the estimated channel gains as $\hat{x}_{i,j} := \beta(z_i, j)$, the vector of estimated channel gains $\hat{x}_i := (\hat{x}_{i,1}, \ldots, \hat{x}_{i,N})^T$
and the matrix of estimated channel gains $\hat{X} := (\hat{x}_1, \ldots, \hat{x}_N)$.


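Lemma 1. Let the achievable rates $r_i(P, \mathcal{S}, h_i)$ be estimated by $\hat{r}_i(P, \mathcal{S}, z_i)$ defined in (11). For any scheduling
decision $\mathcal{S}$ with $|\mathcal{S}| \le n$ and any channel gain estimation $\hat{x}_{i,j} = \beta(z_i, j)$,

$$\Delta_i(\mathcal{S}) = |r_i(P, \mathcal{S}, h_i) - \hat{r}_i(P, \mathcal{S}, z_i)| \le 2P \sum_{j \in \mathcal{S}} |\hat{x}_{i,j} - x_{i,j}|$$

holds simultaneously for all $i \in \mathcal{S}$.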


The proof is given in Section VII-A. To control $\Delta_i$ it is sufficient to control $\sum_{j \in \mathcal{N}} \left| |\hat{h}_{i,j}|^2 - |h_{i,j}|^2 \right|$ based on
the measurements $z_i = \Phi h_i + \mu_i$ defined in (7). Hence, it is not necessary to recover the vectors $h_i$ for all
$i$. Instead, recovery of the vectors $x_i$ is sufficient. We stress that this is different from classical estimation theory
(see e.g. [23]), where, based on the measurements $z_i$, minimization of the error
$\|\hat{h}_i - h_i\|_2^2 = \sum_{j \in \mathcal{N}} |\hat{h}_{i,j} - h_{i,j}|^2$
is considered.


C. Non-Linear Rate Estimation 

In this subsection we study a non-linear channel gain estimation function that uses concepts from compressed
sensing to exploit the structure of the channels. More precisely, we assume that the channel vectors are
compressible, that is, for some $i \in \mathcal{N}$ the channel vector $h_i$ is sparse or at least has fast decaying magnitudes
(after ordering). Compressibility of a given vector $x$ can be quantified by the decay order of

$$\sigma_k(x)_p := \min_{\tilde{x} \in \Sigma_k} \|x - \tilde{x}\|_p,$$

where $\Sigma_k := \{x \in \mathbb{C}^N : |\mathrm{supp}(x)| \le k\}$ is the set of all $k$-sparse vectors. The function $a$ in Definition
6 is given by the solution to the convex optimization problem

$$a(z_i) = \arg\min_{x \in \mathbb{C}^N} \|x\|_1 \quad \text{subject to} \quad \|\Phi x - z_i\|_2 \le \xi. \tag{19}$$

The parameter $\xi$ must be chosen such that $\|\mu_i\|_2 \le \xi$. We will first review some basic results from compressed
sensing and then show how these results can be applied to obtain bounds on $\Delta_i$. Compressed sensing recovery
results can be divided into uniform and nonuniform recovery results. A uniform recovery result means that one
can recover all $k$-sparse vectors, with high probability, from linear measurements with the same matrix.
Nonuniform recovery means that a fixed $k$-sparse vector can be recovered with a randomly drawn measurement
matrix, with high probability. Uniform recovery results are stronger since they imply nonuniform
recovery. To streamline the presentation we consider only uniform recovery.
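The compressibility measure $\sigma_k(x)_p$ introduced above has a simple closed form: for any $\ell_p$-norm the best $k$-sparse approximation keeps the $k$ entries of largest magnitude, so the error is the $\ell_p$-norm of the remaining entries. A minimal sketch (the function name is hypothetical):

```python
def best_k_term_error(x, k, p=1):
    """sigma_k(x)_p: l_p norm of x after removing its k largest-magnitude entries."""
    magnitudes = sorted((abs(v) for v in x), reverse=True)
    tail = magnitudes[k:]
    return sum(t ** p for t in tail) ** (1.0 / p)

# Hypothetical channel vector with two dominant links and weak residual coefficients.
h = [3.0, -0.1, 0.0, 4j, 0.2]
err_l1 = best_k_term_error(h, k=2, p=1)   # sum of the magnitudes of the weak entries
err_exact = best_k_term_error([3.0, 0.0, 4j], k=2)
```

For an exactly $k$-sparse vector the error is zero, while for a compressible vector it stays small relative to the norm of the vector; this is the quantity that enters the error bound of the $\ell_1$ decoder.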
One class of uniform recovery results is based on the restricted isometry property (RIP) (see e.g. El) of
the measurement matrix $\Phi$. The RIP is defined as follows.

Definition 8. An $M \times N$ matrix $\Phi$ satisfies the RIP of order $k \ge 1$ if there exists $\delta_k \ge 0$ such that the
inequality

$$(1 - \delta_k) \|x\|_2^2 \le \|\Phi x\|_2^2 \le (1 + \delta_k) \|x\|_2^2$$

holds for all $x \in \Sigma_k$. The smallest number $\delta_k = \delta_k(\Phi)$ is called the restricted isometry constant of the matrix
$\Phi$.

Many ensembles of random matrices are known to satisfy the RIP with high probability. An important
class of random matrices are matrices with elements that are i.i.d. sub-Gaussian distributed. In particular, if
$X \sim \mathcal{CN}(0, \sigma^2)$, then $E[\exp(Xt)] \le \exp(\sigma^2 t^2 / 2)$ and therefore, according to (17), $X$ is sub-Gaussian.

For concreteness we assume that the elements of $\Phi$ are complex Gaussian distributed, $\phi_{i,j} \sim \mathcal{CN}(0, 1/M)$.
In fact, this assumption enables us to explicitly compute most of the constants that would otherwise depend on
the distribution of $\Phi$. We stress that more general results for sub-Gaussian measurement matrices can be found,
for example, in [25], [24] and references therein. The following theorem, which is proved in [24, Theorem 9.27],
enables us to bound the RIP constant of $\Phi$. To be self-contained, we state the theorem in our notation.

Theorem 1 ([24, Theorem 9.27]). Let $\Phi$ be a random $M \times N$ matrix with i.i.d. elements distributed according
to $\phi_{i,j} \sim \mathcal{CN}(0, 1/M)$. Assume that

$$M \ge 2 \eta^{-2} \left( k \ln(eN/k) + \ln(2\epsilon^{-1}) \right),$$

with $\eta, \epsilon \in (0, 1)$. Then the RIP constant $\delta_k$ of $\Phi$ satisfies

$$\delta_k \le 2 \left( 1 + \frac{1}{\sqrt{2 \ln(eN/k)}} \right) \eta + \left( 1 + \frac{1}{\sqrt{2 \ln(eN/k)}} \right)^2 \eta^2,$$

with probability at least $1 - \epsilon$.
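The two expressions in Theorem 1 are easy to evaluate numerically, which gives a feeling for how many pilot signals the sufficient condition asks for. The sketch below simply codes the two formulas; the function names and the example parameters are assumptions of this illustration.

```python
import math

def min_measurements(k, N, eta, eps):
    """Smallest integer M satisfying M >= 2 eta^-2 (k ln(eN/k) + ln(2/eps))."""
    return math.ceil(2.0 * eta ** -2 * (k * math.log(math.e * N / k) + math.log(2.0 / eps)))

def rip_constant_bound(k, N, eta):
    """Upper bound on delta_k from Theorem 1 for a given eta."""
    c = 1.0 + 1.0 / math.sqrt(2.0 * math.log(math.e * N / k))
    return 2.0 * c * eta + (c * eta) ** 2

# Hypothetical parameters: sparsity k = 10, eta = 0.1, failure probability eps = 0.1.
bound_small = rip_constant_bound(10, 1000, 0.1)
M_small = min_measurements(10, 1000, 0.1, 0.1)     # system with N = 10^3 nodes
M_large = min_measurements(10, 10 ** 6, 0.1, 0.1)  # system with N = 10^6 nodes
```

With these parameters the bound on $\delta_k$ is below 1/3, but for $N = 10^3$ the sufficient number of measurements exceeds $N$ itself, whereas for $N = 10^6$ it is a small fraction of $N$. The condition is only sufficient, so this says nothing about what is necessary, but it shows that the RIP-based guarantee pays off only for large systems.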

As pointed out in [24, Remark 9.28], the statement of the last theorem can be simplified by using $\delta_k \le C_1 \eta$
with $C_1 = 2(1 + \sqrt{1/2}) + (1 + \sqrt{1/2})^2$, so that, with $\delta = C_1 \eta$, the condition $M \ge 2 C_1^2 \delta^{-2} (k \ln(eN/k) + \ln(2\epsilon^{-1}))$ yields $\delta_k \le \delta$.
According to Lemma 1 we can control the rate gap $\Delta_i$ by controlling $\|\hat{x}_i - x_i\|_1$. If the measurement matrix
satisfies the RIP of order $k$ with $\delta_k < 1/3$, the following theorem provides an error estimate.

Theorem 2 ([26, Theorem 3.3]). Suppose $\Phi$ satisfies the RIP of order $k$ with $\delta_k < 1/3$. Let the measurements be given by $z = \Phi h + \mu$ with $\|\mu\|_2 \leq \xi$. Then for any $h \in \mathbb{C}^N$ the solution $\hat{h} = \alpha(z)$ to (19) obeys

$$\|\hat{h} - h\|_2 \leq C_2(\delta_k)\frac{\sigma_k(h)_1}{\sqrt{k}} + 2C_3(\delta_k)\xi, \qquad (20)$$

where $C_2(\delta)$ and $C_3(\delta)$ are constants that depend only on $\delta$.

The theorem is proved in [26, Theorem 3.3]. We stress that many similar error bounds for Problem (19) and related problems are known. Probably the most popular error bound was provided in the seminal paper [14], which requires that the measurement matrix has a RIP constant $\delta_{2k} < \sqrt{2} - 1$. A better error bound is given
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Fig. 1. Bounds on compression ratio M/N over system size N. Maximal compression to achieve perfect reconstruction with probability 
e = 0.9 fixed sparsity k = 10. 


in [24, Theorem 6.12], where $\delta_{2k} < 4/\sqrt{41}$ is required. Recently, [27] showed that $\delta_{2k} < 1/\sqrt{2}$ is sufficient. Figure 1 depicts the compression ratio $M/N$ over the system size $N$ for different RIP constants. The number of measurements is evaluated according to Theorem 1. To obtain a significantly reduced number of measurements, the number of links $N$ must be large. Figure 1 also includes bounds on the number of measurements for non-uniform recovery. Non-uniform recovery results provide error bounds for much smaller system sizes $N$. However, we stress that the RIP is only a sufficient condition for recovery. From Theorem 1, Theorem 2 and Lemma 1 we devise the following corollary.

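Corollary 1. Let $\Phi$ be a random $M \times N$ matrix with i.i.d. elements distributed according to $\phi_{i,j} \sim \mathcal{CN}(0,1/M)$. Suppose the measurements are given by $z_i = \Phi h_i + \mu_i$ with $\|\mu_i\|_2 \leq \xi$. If

$$M \geq 2C_1^2\delta^{-2}\left(k\ln(eN/k) + \ln(2\epsilon^{-1})\right),$$

with $\delta_k \leq \delta < 1/3$ and $\|h_i\|_2 \leq \alpha_i$, then for all $i \in \mathcal{N}$ the solutions to (19) obey

$$\Pr\left\{\exists i \in \mathcal{N} : \Delta_i > 2P\,q(h_i,\xi)\left(2\alpha_i + q(h_i,\xi)\right)\right\} \leq \epsilon,$$

with $q(h_i,\xi) = C_2(\delta_k)\,\sigma_k(h_i)_1/\sqrt{k} + 2C_3(\delta_k)\xi$ and $C_2(\delta), C_3(\delta) > 0$ as in Theorem 2.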

The proof is given in Section VII-B. We point out that if the number of measurements is in the order of $O(k\ln(eN/k))$ and, for all $i \in \mathcal{N}$, the channels $h_i$ are $k$-sparse (i.e. $\sigma_k(h_i)_1 = 0$), then the rate estimation error $\Delta_i$ remains bounded. Moreover, in the noiseless case ($\xi = 0$) perfect recovery can be achieved. However, in both cases the system size $N$ must be sufficiently large, as noted above and illustrated in Figure 1.
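
To make the requirement on the system size concrete, the following minimal sketch evaluates the measurement bound of Corollary 1 for a fixed sparsity $k = 10$. The values $\delta = 0.3$ and $\epsilon = 0.1$ and the function name `min_measurements` are assumptions chosen for illustration only; they are not taken from the figure.

```python
import math

# C_1 from the remark following Theorem 1 (approximately 6.33).
C1 = 2 * (1 + math.sqrt(0.5)) + (1 + math.sqrt(0.5)) ** 2


def min_measurements(N, k, delta, eps):
    """Smallest integer M with M >= 2 C1^2 delta^-2 (k ln(eN/k) + ln(2/eps))."""
    log_term = k * math.log(math.e * N / k) + math.log(2.0 / eps)
    return math.ceil(2.0 * C1 ** 2 / delta ** 2 * log_term)


if __name__ == "__main__":
    # Illustrative values: k = 10, delta = 0.3 (< 1/3), eps = 0.1.
    for N in (10**4, 10**5, 10**6, 10**7):
        M = min_measurements(N, k=10, delta=0.3, eps=0.1)
        print(f"N={N:>9d}  M={M:>8d}  M/N={M / N:.4f}")
```

Under these assumed values the bound exceeds $N$ for $N = 10^4$ and only drops below $N$ for considerably larger systems, which is the behaviour described above.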
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D. Linear Rate Estimation 

In this subsection we derive bounds on the rate gap $\Delta_i$ for linear channel gain estimation functions. First, we prove a general theorem that is valid for any linear estimation function of the form defined earlier and any ensemble of measurement matrices that satisfies the concentration of measure inequality (16). We have the following general result, which is the main result of this section.

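Theorem 3. Let channel state information be given by any linear estimation function $\beta(z_i, j) = |\langle \Psi z_i, e_j \rangle|^2$, with $\Psi = \Phi^H A$, where $A$ is a positive semi-definite matrix. If $\Phi$ fulfills the concentration inequality (16) and the number of active transmissions is bounded by $1 \leq |\mathcal{S}| \leq n$, then for any fixed channels $H = (h_1, \dots, h_N)$ and any $u_0 > 0$, $\rho_0 > 0$ and $\epsilon > 0$,

$$\Pr\left\{\exists \mathcal{S} \subseteq \mathcal{N}, |\mathcal{S}| = n, \exists i \in \mathcal{S} : \Delta_i(\mathcal{S}) > 2P\|h_i\|_2^2\left(4\sqrt{n}(1+u_0)\epsilon + \rho_0\right)\right\}$$

$$\leq \exp\left(\log(4n^2) + n\log(Ne/n) - \gamma(\epsilon)\right) + \exp\left(n\log(Ne/n)\right)\Pr\{s_{\max}(\Psi\Phi) > u_0\}$$

$$+ \exp\left(n\log(Ne/n)\right)\max_{i \in \mathcal{N}}\Pr\left\{\|\Psi\tilde{\mu}_i\|_2\left(\|\Psi\tilde{\mu}_i\|_2 + 2\|\Psi\Phi\tilde{h}_i\|_2\right) > \rho_0\right\}, \qquad (21)$$

where $\tilde{h}_i = h_i/\|h_i\|_2$ and $\tilde{\mu}_i = \mu_i/\|h_i\|_2$ denote the normalized channel and noise vectors.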

The proof is deferred to Section VII-C. Clearly, the bound depends on the choice of $\Psi$ and the distribution of $\Phi$. The latter determines the function $\gamma(\epsilon)$. However, the theorem is rather general and enables the evaluation of different linear estimation functions under different assumptions on the channels and under different distributions of the measurement matrix $\Phi$.

To illustrate the strength of Theorem 3, let us assume that the channel vectors are $k$-sparse, $h_i \in \Sigma_k$ for all $i$, and consider the following estimation function and measurement matrix. Let the elements of $\Phi$ be complex Gaussian distributed and define the linear estimation function as

$$\beta(z_i, j) = |\langle \Phi^+ z_i, e_j \rangle|^2, \qquad (22)$$

where $\Phi^+$ is defined as the pseudo-inverse $\Phi^+ = \Phi^H(\Phi\Phi^H)^{-1}$ for $M < N$. We devise the following corollary.


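Corollary 2. Under the assumptions of Theorem 3, let $M < N$. Suppose that the elements of $\Phi$ are distributed according to $\phi_{i,j} \sim \mathcal{CN}(0,1/M)$. Let $\beta(z_i, j) = |\langle \Phi^+ z_i, e_j \rangle|^2$. Assume that $\|e\|_2 = 0$ (noiseless measurements) and that for all $i \in \mathcal{N}$ we have $h_i \in \Sigma_k$ and $h_{i,j} \sim \mathcal{CN}(0,1)$ for all $j \in \mathrm{supp}(h_i)$. We have

$$\Pr\left\{\exists i \in \mathcal{N} : \Delta_i > 16P\sqrt{\frac{\kappa n}{M}\ln\left(\frac{4nN+1}{\epsilon}\right)}\left(\sqrt{2\ln\left(\frac{4nN+1}{\epsilon}\right)} + k\right)\right\} \leq \epsilon,$$

with $\kappa = 2/(1 - \log(2))$.
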


The proof is given in Section VII-D. A few remarks are in order. For fixed transmit powers $P$, a fixed system size $N$, a given error probability $\epsilon$ and a fixed number of active links $n$, the rate estimation error scales with $1/\sqrt{M}$, which is also in accordance with the estimation results in [15, Theorem 4.1], where essentially the same scaling is achieved. As expected, the linear decoding function is not able to achieve perfect recovery (for $M < N$). Perfect recovery can only be achieved by the compressed sensing based decoder, but this comes at the cost of additional complexity. However, the simulations in the next section show that the linear decoder
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performs reasonably well when applied to a small systems. Moreover, a linear decoder can be used to perform 
a subset selection and to reduce the problem size for non-linear algorithms. 
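
The pseudo-inverse based estimation function (22) can be sketched numerically as follows. This is a minimal illustration, not the simulation code used for the figures: the dimensions $N = 25$, $M = 15$, $k = 3$, the random seed and all variable names are assumptions. It shows that $\Phi^+\Phi$ is a projector with largest singular value 1 and that, for $M < N$, the estimated gains do not coincide with the true gains even without noise.

```python
import numpy as np

rng = np.random.default_rng(0)
N, M, k = 25, 15, 3  # illustrative sizes (assumed), with M < N

# Measurement matrix with i.i.d. CN(0, 1/M) entries.
Phi = (rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))) * np.sqrt(0.5 / M)

# k-sparse channel vector with CN(0, 1) entries on a random support.
h = np.zeros(N, dtype=complex)
support = rng.choice(N, size=k, replace=False)
h[support] = (rng.standard_normal(k) + 1j * rng.standard_normal(k)) * np.sqrt(0.5)

z = Phi @ h  # noiseless measurements

# Pseudo-inverse for M < N: Phi^+ = Phi^H (Phi Phi^H)^{-1}.
Phi_pinv = Phi.conj().T @ np.linalg.inv(Phi @ Phi.conj().T)

# beta(z, j) = |<Phi^+ z, e_j>|^2 for all j at once.
gains_est = np.abs(Phi_pinv @ z) ** 2
gains_true = np.abs(h) ** 2

# Phi^+ Phi is an orthogonal projector, so its largest singular value is 1.
P = Phi_pinv @ Phi
s_max = np.linalg.norm(P, 2)
```

Because $\Phi^+\Phi$ projects onto the $M$-dimensional row space of $\Phi$, the estimate is the projection of $h$ rather than $h$ itself, which is why the linear decoder cannot achieve perfect recovery for $M < N$.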

V. Numerical Examples 

We consider a cellular system with one base station and 25 users. Every node has a single antenna. The users are grouped in $G$ user groups $\mathcal{G}_g$, $g = 1, 2, \dots, G$. Users within the same user group experience the same path loss. The channels from users $i \in \mathcal{G}_f$ to users $j \in \mathcal{G}_g$ are given by

$$h_{j,i} = a_{g,f}\,b_{j,i} \in \mathbb{C}, \qquad (23)$$

where $b_{j,i} \sim \mathcal{CN}(0,1)$ denotes the small-scale fading coefficient and $a_{g,f}$ denotes the distance-dependent path loss coefficient, with $a_{g,g} = 1$ for all $g$. A similar channel model was used in [28] to model large cellular networks with co-located users. Under certain assumptions the channel matrix $H$ is compressible. More precisely, the matrix $H$ can be approximated by a low-rank and/or sparse matrix $\hat{H}$ if the user groups $\mathcal{G}_g$ are of sufficient size and/or the path loss coefficients $a_{f,g}$ decay sufficiently fast.

We compare two setups: i) 5 groups of 5 users each, the path loss coefficients are chosen as 10 z ' 10 , with z 
uniformly distributed in [0,1]. ii) 25 users, all in the same group, with path loss coefficient $a_{f,g} = 1$ for all $f, g$, i.e., all channels are i.i.d. complex Gaussian distributed. The rate requirement is set to $\frac{1}{10}\log(1 + P)$. Problem (19) was solved using the TFOCS toolbox [29].
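
The grouped channel model (23) can be sketched as below. This is an illustration only: the base station is omitted, and the cross-group path loss value 0.05, the helper names and the seed are hypothetical placeholders rather than the values used in the simulations.

```python
import random


def complex_gaussian(rng):
    """Draw one CN(0, 1) sample: real and imaginary parts have variance 1/2."""
    s = 0.5 ** 0.5
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def channel_matrix(group_of, a, rng):
    """H[j][i] = a[g][f] * b_ji for user i in group f and user j in group g."""
    n = len(group_of)
    return [
        [a[group_of[j]][group_of[i]] * complex_gaussian(rng) for i in range(n)]
        for j in range(n)
    ]


rng = random.Random(1)
G, users_per_group = 5, 5
group_of = [u // users_per_group for u in range(G * users_per_group)]

# Hypothetical path loss: 1 within a group, 0.05 between different groups.
a = [[1.0 if g == f else 0.05 for f in range(G)] for g in range(G)]
H = channel_matrix(group_of, a, rng)
```

With a small cross-group coefficient most of the energy of $H$ sits in the diagonal blocks, which is the sense in which the matrix becomes compressible; setting all coefficients to 1 gives the i.i.d. setup ii).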

We compare the solution to problem (13) for the non-linear compressed sensing estimation function (19) and the linear estimation function (22). In the simulations $\epsilon = 0$, since the analytic results do not give tight bounds for systems with $N = 25$. Nevertheless, the results in Figure 2 show that linear estimation performs very close to the much more complex compressed sensing based estimation. Figure 3 shows that if the channel matrix is compressible, the compressed sensing estimation function performs better than the linear estimation function. Since the considered systems are rather small, it can be expected that the gain of compressed sensing increases for larger systems.


VI. Conclusion 

We developed a channel sensing and reconstruction protocol that enables the network controller to estimate 
the achievable rates based on compressed non-adaptive measurements. The scaling of the estimation error at 
the network controller has been analyzed for linear and non-linear decoding functions. Scaling results for the 
non-linear decoding function were shown to follow from well-known compressed sensing results. However,
for a small to moderate system size N the compressed sensing results do not provide reasonable performance 
bounds. For linear decoding functions we derived a general result which can be used to analyze the performance 
of a variety of linear decoding functions and measurement matrices. For a linear decoding function based on 
the pseudo inverse and Gaussian measurement matrices we investigated the scaling of the rate estimation error 
with the number of measurements. 

The measurement protocol is based on a few simplifications which render the direct application in practical 
systems rather difficult. For example, the assumption of perfect time and frequency synchronization is hard 
(if not impossible) to achieve in distributed networks with a huge number of devices. To this end, the analog 
coding developed in [13] can be used to relax the requirements on the synchronization.
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Fig. 2. Average sum-rate over compression factor M/N ; Setup: 25 users, 1 base station, perfect feedback channel (no feedback and 
quantization noise), single group; channel matrix i.i.d. Gauss and not compressible. Comparison of linear and non-linear rate estimation. 



Fig. 3. Average sum-rate over compression factor M/N; Setup: 25 users, 1 base station, perfect feedback channel (no feedback and quantization noise), 5 groups of 5 users each; channel matrix compressible. Comparison of linear and non-linear rate estimation.


Future work may also include the exploration of different linear and non-linear decoding functions. To this end, Theorem 3 provides a good basis to evaluate different linear decoding functions. For non-linear decoding functions, applications of matrix recovery and other compressed sensing related approaches are a promising research direction. Extensions to other network architectures are another prospective direction. Coordinated transmission techniques where groups of devices (or antennas) are jointly transmitting with beamforming vectors
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w given by some finite codebook can be analyzed with the proposed framework by estimating \(hi,w)\. 
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VII. Proofs 

A. Proof of Lemma 1

Proof: For each $i$ the corresponding rate gap $\Delta_i$ can be rewritten using the abbreviations $L(s) := \log(1 + s)$, $Q_j := P_j|h_{i,j}|^2$ and $q_j := P_j|\hat{h}_{i,j}|^2$:


A,- = 


L 


log 




1 + Ezgs\W © 
1 + Ezgs © 


- L 




= L 


1 + EzgS © 

Ezgs © ~ © 

1 + EzgS © 

EieS © - © 


log 



L 


V 1 + Ez e s\{i} 

1 + E;g 5 \{i} © \ 

1 + EzgS\{i} © / 

Ezg 5 \{»} © ~~ © \ 

1 + EiG5\{i} © / 

/ Eig«S\{i} © ~ © \ 

V 1 + EZGS\{i} © / 


(24) 


where the first inequality follows from the triangle inequality and the second inequality follows from Jensen's inequality and the fact that the denominators are positive. Since $L(x) \leq x$ for $x \geq 0$ and by assumption $P_j \leq P$ for all $j$, we obtain the first claim

$$\Delta_i \leq 2P\sum_{l \in \mathcal{S}}\left||h_{i,l}|^2 - |\hat{h}_{i,l}|^2\right|.$$


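B. Proof of Corollary 1

Proof: Using Lemma 1, the Cauchy-Schwarz inequality and the reverse triangle inequality we get

$$\Delta_i \leq 2P\sum_{j \in \mathcal{S}}\left||h_{i,j}|^2 - |\hat{h}_{i,j}|^2\right| \qquad (25)$$

$$= 2P\sum_{j \in \mathcal{S}}\left|\left(|h_{i,j}| - |\hat{h}_{i,j}|\right)\left(|h_{i,j}| + |\hat{h}_{i,j}|\right)\right| \qquad (26)$$

$$\leq 2P\left\|h_i - \hat{h}_i\right\|_2\left\||h_i| + |\hat{h}_i|\right\|_2 \qquad (27)$$

$$\leq 2P\left\|h_i - \hat{h}_i\right\|_2\left(2\|h_i\|_2 + \left\|h_i - \hat{h}_i\right\|_2\right). \qquad (28)$$

By assumption $M \geq 2C_1^2\delta^{-2}\left(k\ln(eN/k) + \ln(2\epsilon^{-1})\right)$, with $\delta < 1/3$, such that $\Phi$ satisfies the RIP with probability at least $1 - \epsilon$. Hence, we can use Theorem 2 and plug (20) into (28). Finally, defining $q(h_i,\xi) = C_2(\delta_k)\,\sigma_k(h_i)_1/\sqrt{k} + 2C_3(\delta_k)\xi$, the claim follows. ■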

C. Proof of Theorem 3

The proof of Theorem 3 is developed in several steps.

Lemma 2. Let $X$ and $Y$ be two non-negative real random variables. If $f : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ is monotonically increasing in the second input and $y_0 > 0$ is a positive constant, then

$$\Pr\{f(X,Y) > \epsilon\} \leq \Pr\{f(X,y_0) > \epsilon\} + \Pr\{Y > y_0\}.$$
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Proof: First assume that the random variable Y is bounded by Y > yo. In this case the claim is trivially 
true, since $\Pr\{Y > y_0\} = 1$. Therefore, assume that $\Pr\{Y \leq y_0\} > 0$. We abbreviate $Z = f(X,Y)$ and $Z_0 = f(X,y_0)$. Since $f$ is monotonically increasing in its second input, $Z \leq Z_0$ whenever $Y \leq y_0$. For any arbitrary but fixed $y_0 > 0$ we therefore have

$$\Pr\{Z > \epsilon\} = 1 - \Pr\{Z \leq \epsilon, Y \leq y_0\} - \Pr\{Z \leq \epsilon, Y > y_0\}$$

$$\leq 1 - \Pr\{Z \leq \epsilon, Y \leq y_0\} \leq 1 - \Pr\{\{Z_0 \leq \epsilon\} \cap \{Y \leq y_0\}\}$$

$$= \Pr\{\{Z_0 > \epsilon\} \cup \{Y > y_0\}\} \qquad (29)$$

$$\leq \Pr\{Z_0 > \epsilon\} + \Pr\{Y > y_0\},$$

where we first used De Morgan's law and then the union bound. ■
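
The statement of Lemma 2 can be checked empirically with a short Monte Carlo sketch. The choice $f(x,y) = xy$, the exponential distributions and the values of $y_0$ and $\epsilon$ are assumptions made purely for illustration; the function name `lemma2_sides` is hypothetical.

```python
import random


def lemma2_sides(f, sample_x, sample_y, y0, eps, trials=100_000, seed=0):
    """Return empirical Pr{f(X,Y) > eps} and Pr{f(X,y0) > eps} + Pr{Y > y0}."""
    rng = random.Random(seed)
    lhs = 0
    rhs = 0
    for _ in range(trials):
        x, y = sample_x(rng), sample_y(rng)
        lhs += f(x, y) > eps
        rhs += (f(x, y0) > eps) + (y > y0)
    return lhs / trials, rhs / trials


# f(x, y) = x * y is increasing in y for non-negative x.
lhs, rhs = lemma2_sides(
    f=lambda x, y: x * y,
    sample_x=lambda rng: rng.expovariate(1.0),
    sample_y=lambda rng: rng.expovariate(1.0),
    y0=2.0,
    eps=1.5,
)
```

Since the event inclusion behind the lemma holds sample by sample, the empirical left-hand side can never exceed the empirical right-hand side, regardless of the number of trials.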


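Lemma 3. Let $V = \{v_1, \dots, v_n\} \subset \mathbb{C}^N$ be an arbitrary but fixed set of mutually orthogonal vectors ($n \leq N$), $\Psi \in \mathbb{C}^{N \times M}$ and $A \in \mathbb{C}^{M \times M}$ be a positive semi-definite matrix. If $w = \Phi u + e$ and $\Phi$ is an $M \times N$ random matrix that is isotropically distributed and satisfies the concentration inequality (16), then for any fixed $u \in S^{N-1}$ and any fixed $e \in \mathbb{C}^M$

$$\Pr\left\{\sum_{i=1}^{n}\left||\langle u, v_i\rangle|^2 - |\langle \Psi w, v_i\rangle|^2\right| > 4\sqrt{n}(1+u_0)\epsilon + \rho_0\right\} \leq 4n\exp(-\gamma(\epsilon)) + \Pr\{s_{\max}(\Psi\Phi) > u_0\}$$

$$+ \Pr\left\{\|\Psi e\|_2\left(\|\Psi e\|_2 + 2\|\Psi\Phi u\|_2\right) > \rho_0\right\} \qquad (30)$$

holds, where $\gamma(\epsilon)$ depends on the distribution of $\Phi$ and $\rho_0, u_0 > 0$ are positive constants.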


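Proof: Consider the vectors $a, b, c \in \mathbb{C}^n$ with elements $a_i = \langle u, v_i\rangle$, $b_i = \langle \Psi\Phi u, v_i\rangle$ and $c_i = \langle \Psi e, v_i\rangle$. Obviously $\|a\|_2 \leq 1$, $\|b\|_2 \leq \|\Psi\Phi u\|_2$ and $\|c\|_2 \leq \|\Psi e\|_2$. With $1/p + 1/q = 1$ we have

$$D := \sum_{i=1}^{n}\left||a_i|^2 - |b_i + c_i|^2\right| = \sum_{i=1}^{n}\left||a_i|^2 - |b_i|^2 - |c_i|^2 - 2\Re(b_i\bar{c}_i)\right|$$

$$\leq \sum_{i=1}^{n}\left||a_i|^2 - |b_i|^2\right| + \sum_{i=1}^{n}\left(|c_i|^2 + 2|b_i||c_i|\right)$$

$$\leq \sum_{i=1}^{n}\left||a_i|^2 - |b_i|^2\right| + \|c\|_2\left(\|c\|_2 + 2\|b\|_2\right) \qquad (31)$$

$$= \sum_{i=1}^{n}\left|(|a_i| - |b_i|)(|a_i| + |b_i|)\right| + \|c\|_2\left(\|c\|_2 + 2\|b\|_2\right)$$

$$\leq \left\||a| - |b|\right\|_p \cdot \left\||a| + |b|\right\|_q + \|c\|_2\left(\|c\|_2 + 2\|b\|_2\right)$$

$$\leq \|a - b\|_p \cdot \left(\|a\|_q + \|b\|_q\right) + \|c\|_2\left(\|c\|_2 + 2\|b\|_2\right).$$

Recall that $b$ and $c$ are random vectors. We now apply Lemma 2 twice. First, for the non-negative random variables $X = \|a - b\|_p \cdot (\|a\|_q + \|b\|_q)$ and $Y = \|c\|_2(\|c\|_2 + 2\|b\|_2)$, for any $\rho_0 > 0$, we have

$$\Pr\{D > \epsilon'\} \leq \underbrace{\Pr\left\{\|a - b\|_p \cdot (\|a\|_q + \|b\|_q) + \rho_0 > \epsilon'\right\}}_{(i)} + \Pr\left\{\|c\|_2(\|c\|_2 + 2\|b\|_2) > \rho_0\right\}. \qquad (32)$$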
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Second, for X = ||a — b|| p and Y = ||a|| g + ||b||g, for any yo > 0, we have, 
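$$(i) = \Pr\left\{\|a - b\|_p \cdot (\|a\|_q + \|b\|_q) > \epsilon' - \rho_0\right\} \leq \Pr\left\{\|a - b\|_p > \frac{\epsilon' - \rho_0}{y_0}\right\} + \Pr\left\{\|a\|_q + \|b\|_q > y_0\right\}. \qquad (33)$$

By assumption $\Psi = \Phi^H A = \Phi^H A^{1/2}A^{1/2}$, where $A^{1/2}$ is the principal square root of $A$. With $B := A^{1/2}\Phi$ we have $\langle \Psi\Phi u, v_i\rangle = \langle Bu, Bv_i\rangle$, and from the definition of $a$ and $b$ it follows that

$$\|a - b\|_p = \left\|\left\{|\langle u, v_i\rangle - \langle Bu, Bv_i\rangle|\right\}_{i=1}^{n}\right\|_p \leq \|m\|_p, \qquad (34)$$

where the $n$ components $m_i$ of the vector $m$ follow from the polarization identity as

$$|a_i - b_i| = \frac{1}{4}\left|\sum_{\xi \in \{\pm 1, \pm\mathrm{i}\}}\xi\left(\|u + \xi v_i\|_2^2 - \|B(u + \xi v_i)\|_2^2\right)\right| \leq \frac{1}{4}\sum_{\xi \in \{\pm 1, \pm\mathrm{i}\}}\left|\|u + \xi v_i\|_2^2 - \|B(u + \xi v_i)\|_2^2\right|$$

$$\leq \max_{\xi \in \{\pm 1, \pm\mathrm{i}\}}\left|\|u + \xi v_i\|_2^2 - \|B(u + \xi v_i)\|_2^2\right| =: m_i. \qquad (35)$$

Thus, we have

$$(i) \leq \Pr\left\{\|m\|_p > \frac{\epsilon' - \rho_0}{y_0}\right\} + \Pr\left\{\|a\|_q + \|b\|_q > y_0\right\}. \qquad (36)$$

Next, we use $p = \infty$, $q = 1$, $\|a\|_1 \leq \sqrt{n}$ and $\|b\|_1 \leq \sqrt{n}\|b\|_2$. By assumption $\Phi$ is isotropically distributed, i.e., each component of $m$ has the same distribution. Thus, $\|m\|_\infty$ is the maximum over $4n$ identically distributed random variables of the form $\left|\|u + \xi v_i\|_2^2 - \|B(u + \xi v_i)\|_2^2\right|$. Define $u_0 = y_0/\sqrt{n} - 1$. Using the union bound and the concentration inequality (16) we have

$$(i) \leq 4n\Pr\left\{\left|\|u + v_1\|_2^2 - \|B(u + v_1)\|_2^2\right| > \frac{\epsilon' - \rho_0}{y_0}\right\} + \Pr\left\{\|b\|_2 > \frac{y_0}{\sqrt{n}} - 1\right\}$$

$$= 4n\Pr\left\{\left|\|u + v_1\|_2^2 - \|B(u + v_1)\|_2^2\right| > \frac{\epsilon' - \rho_0}{\sqrt{n}(u_0 + 1)}\right\} + \Pr\{\|b\|_2 > u_0\}$$

$$\leq 4n\exp\left(-\gamma\left(\frac{\epsilon' - \rho_0}{\|u + v_1\|_2^2\sqrt{n}(1 + u_0)}\right)\right) + \Pr\{\|b\|_2 > u_0\} \qquad (37)$$

$$\leq 4n\exp\left(-\gamma\left(\frac{\epsilon' - \rho_0}{4\sqrt{n}(1 + u_0)}\right)\right) + \Pr\{\|b\|_2 > u_0\}$$

$$= 4n\exp(-\gamma(\epsilon)) + \Pr\{\|b\|_2 > u_0\}.$$

The last steps follow from $\|u + v_i\|_2^2 \leq 4$ and from setting $\epsilon' = 4\epsilon\sqrt{n}(1 + u_0) + \rho_0$. Since $\Pr\{\|\Psi\Phi u\|_2 > u_0\} \leq \Pr\{s_{\max}(\Psi\Phi) > u_0\}$, the claim follows from the last equation and (32). ■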
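
The polarization identity used in the proof of Lemma 3 can be verified numerically. The sketch below assumes the convention that the inner product is linear in its first argument, $\langle x, y\rangle = \sum_j x_j\bar{y}_j$; with the opposite convention the roles of $\xi$ and $\bar{\xi}$ swap. The function names are chosen for illustration.

```python
import random


def inner(x, y):
    """<x, y> = sum_j x_j * conj(y_j), linear in the first argument."""
    return sum(a * b.conjugate() for a, b in zip(x, y))


def polarization(x, y):
    """(1/4) * sum over xi in {1, -1, i, -i} of xi * ||x + xi*y||^2."""
    total = 0j
    for xi in (1, -1, 1j, -1j):
        shifted = [a + xi * b for a, b in zip(x, y)]
        total += xi * inner(shifted, shifted).real
    return total / 4


rng = random.Random(0)
u = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(6)]
v = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(6)]
gap = abs(polarization(u, v) - inner(u, v))  # zero up to rounding error
```

The identity expresses an inner product through four squared norms, which is what allows a concentration inequality for norms to control the inner products $\langle Bu, Bv_i\rangle$.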

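Now we are ready to prove Theorem 3.

Proof of Theorem 3: Let $\mathcal{S} \subseteq \mathcal{N}$ be arbitrary but fixed. By the assumptions, the rate gap bound in Lemma 1 can be rewritten as

$$\Delta_i(\mathcal{S}) \leq 2P\|h_i\|_2^2\sum_{l \in \mathcal{S}}\left||\langle \tilde{h}_i, e_l\rangle|^2 - |\langle \Psi(\Phi\tilde{h}_i + \tilde{\mu}_i), e_l\rangle|^2\right|,$$

where we defined $\tilde{h}_i = h_i/\|h_i\|_2$ and $\tilde{\mu}_i = \mu_i/\|h_i\|_2$. If we fix $|\mathcal{S}| = n$, Lemma 3 yields

$$\Pr\left\{\Delta_i(\mathcal{S}) > 2P\|h_i\|_2^2\left(4\sqrt{n}(1 + u_0)\epsilon + \rho_0\right)\right\} \leq 4n\exp(-\gamma(\epsilon)) + \Pr\{s_{\max}(\Psi\Phi) > u_0\}$$

$$+ \Pr\left\{\|\Psi\tilde{\mu}_i\|_2\left(\|\Psi\tilde{\mu}_i\|_2 + 2\|\Psi\Phi\tilde{h}_i\|_2\right) > \rho_0\right\}, \qquad (38)$$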
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for an arbitrary i e S. Taking the union bound over all i G S yields. 


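$$\Pr\left\{\exists i \in \mathcal{S} : \Delta_i(\mathcal{S}) > 2P\|h_i\|_2^2\left(4\sqrt{n}(1 + u_0)\epsilon + \rho_0\right)\right\} \leq 4n^2\exp(-\gamma(\epsilon)) + n\Pr\{s_{\max}(\Psi\Phi) > u_0\}$$

$$+ \sum_{i \in \mathcal{S}}\Pr\left\{\|\Psi\tilde{\mu}_i\|_2\left(\|\Psi\tilde{\mu}_i\|_2 + 2\|\Psi\Phi\tilde{h}_i\|_2\right) > \rho_0\right\}. \qquad (39)$$

Finally, applying the union bound over all $\binom{N}{n}$ scheduling decisions $\mathcal{S} \subseteq \mathcal{N}$ with $|\mathcal{S}| = n$ gives

$$\Pr\left\{\exists \mathcal{S} \subseteq \mathcal{N}, |\mathcal{S}| = n, \exists i \in \mathcal{S} : \Delta_i(\mathcal{S}) > 2P\|h_i\|_2^2\left(4\sqrt{n}(1 + u_0)\epsilon + \rho_0\right)\right\}$$

$$\leq \exp\left(\log(4n^2) + n\log(Ne/n) - \gamma(\epsilon)\right) + \exp\left(n\log(Ne/n)\right)\Pr\{s_{\max}(\Psi\Phi) > u_0\}$$

$$+ \exp\left(n\log(Ne/n)\right)\max_{i \in \mathcal{N}}\Pr\left\{\|\Psi\tilde{\mu}_i\|_2\left(\|\Psi\tilde{\mu}_i\|_2 + 2\|\Psi\Phi\tilde{h}_i\|_2\right) > \rho_0\right\}, \qquad (40)$$

where we used $\binom{N}{n} \leq (Ne/n)^n$. ■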
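
The binomial bound $\binom{N}{n} \leq (Ne/n)^n$ used in the last step can be checked directly. The sketch below works in the log domain to avoid overflow; the function name is an assumption for illustration.

```python
import math


def log_binomial_bound_gap(N, n):
    """Return n*log(N*e/n) - log(C(N, n)); non-negative when the bound holds."""
    return n * math.log(N * math.e / n) - math.log(math.comb(N, n))


# Example: number of scheduling decisions for N = 25 nodes and n = 5 active links.
N, n = 25, 5
exact = math.comb(N, n)
bound = (N * math.e / n) ** n
```

The gap is smallest in relative terms for small $n$, so the bound is reasonably tight in the regime of few simultaneously active links.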


D. Proof of Corollary 2

The following result will be useful in the proof. Let $a$ be a random vector with elements $a_j \sim \mathcal{CN}(0,1)$. Then, for all $t > 0$,

$$\Pr\left\{\|a\|_2 - E[\|a\|_2] > t\right\} \leq \exp(-t^2/2). \qquad (41)$$

In fact, this is a special case of the concentration of measure theorem for Lipschitz functions, see [24, Theorem 8.40].
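
The tail bound (41) can be compared against simulated data. In the sketch below the vector length 10, the number of trials and the helper names are assumptions for illustration, and the expectation $E[\|a\|_2]$ is replaced by the sample mean.

```python
import math
import random


def norm_samples(n, trials, seed=0):
    """Draw `trials` samples of ||a||_2 for a with n i.i.d. CN(0, 1) entries."""
    rng = random.Random(seed)
    s = math.sqrt(0.5)  # standard deviation of the real and imaginary parts
    return [
        math.sqrt(sum(rng.gauss(0, s) ** 2 + rng.gauss(0, s) ** 2 for _ in range(n)))
        for _ in range(trials)
    ]


def empirical_tail(samples, t):
    """Fraction of samples exceeding the sample mean by more than t."""
    mean = sum(samples) / len(samples)
    return sum(x - mean > t for x in samples) / len(samples)


samples = norm_samples(n=10, trials=20_000)
tails = {t: (empirical_tail(samples, t), math.exp(-t * t / 2)) for t in (0.5, 1.0, 1.5)}
```

Each entry of `tails` pairs the empirical tail probability with the bound $\exp(-t^2/2)$, so the two can be compared for each $t$.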

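Proof: Let $h_i$ be arbitrary but fixed. Setting $\rho_0 = 0$, we have

$$\Pr\left\{\|\Psi e\|_2\left(\|\Psi e\|_2 + 2\|\Psi\Phi u\|_2\right) > \rho_0\right\} = 0.$$

Since $\Phi^+\Phi$ is a projector (i.e. Hermitian and idempotent), $s_{\max}(\Phi^+\Phi) = 1$, and therefore we can set $u_0 = 1$ and obtain $\Pr\{s_{\max}(\Psi\Phi) > 1\} = 0$. Using (18) we get from Theorem 3

$$\Pr\left\{\exists \mathcal{S} \subseteq \mathcal{N}, |\mathcal{S}| \leq n \leq N/2 : \Delta_i > 16P\|h_i\|_2^2\sqrt{n}\epsilon'\right\} \leq 4nN\exp\left(-M\epsilon'^2/\kappa\right),$$

with $\kappa = 2/(1 - \log(2))$. Since $h_i$ is also random we can use Lemma 2 and get

$$\Pr\left\{\exists i \in \mathcal{N} : \Delta_i > 16Ph_0\sqrt{n}\epsilon'\right\} \leq 4nN\exp\left(-M\epsilon'^2/\kappa\right) + \Pr\left\{\|h_i\|_2^2 > h_0\right\}.$$

By assumption we have $E[\|h_i\|_2^2] = k$. Thus, (41) gives

$$\Pr\left\{\|h_i\|_2 > h_0\right\} = \Pr\left\{\|h_i\|_2 > t + k\right\} \leq \exp(-t^2/2).$$

Hence, if we set $h_0 = t + k$ and $t = \sqrt{2M\epsilon'^2/\kappa}$,

$$\Pr\left\{\exists i \in \mathcal{N} : \Delta_i > 16P\left(\sqrt{2M\epsilon'^2/\kappa} + k\right)\sqrt{n}\epsilon'\right\} \leq (4nN + 1)\exp\left(-M\epsilon'^2/\kappa\right).$$

Finally, setting $\epsilon = (4nN + 1)\exp\left(-M\epsilon'^2/\kappa\right)$ the claim follows. ■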
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